scripts: complete slime-exact port of all scripts + gpt-oss 20B support by aoshen02 · Pull Request #260 · vllm-project/vime

aoshen02 · 2026-06-16T14:41:18Z

Summary

This PR consolidates four work streams:

1. slime-exact translation of run scripts (original scope)

23 new scripts + 6 existing updated to match slime@cutoff
sglang→vllm prefix swap, _slime→_vime checkpoint paths, EP boolean conversion, speculative config merge to JSON
Translation rules per SGLANG_TO_VLLM_TRANSLATION.md
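The rules listed above can be sketched as a small text transformation. This is only an illustration under stated assumptions: SGLANG_TO_VLLM_TRANSLATION.md is not reproduced here, `translate_line` is a hypothetical helper, and `--sglang-example-flag` and the checkpoint path are made-up inputs. Only the EP conversion, the prefix swap, and the checkpoint suffix rename are covered.

```shell
# Hypothetical helper (not part of the PR): applies three of the rules named
# above to one line of a run script.
translate_line() {
  printf '%s\n' "$1" | sed -E \
    -e 's/--sglang-ep-size +[0-9]+/--vllm-enable-expert-parallel/g' \
    -e 's/--sglang-/--vllm-/g' \
    -e 's/_slime/_vime/g'
}

# --sglang-example-flag and the path are invented for illustration only.
translated="$(translate_line '--sglang-ep-size 8 --sglang-example-flag 0.7 --save /ckpt/model_slime/')"
echo "$translated"
# prints: --vllm-enable-expert-parallel --vllm-example-flag 0.7 --save /ckpt/model_vime/
```

The EP rule has to run before the generic prefix swap, otherwise `--sglang-ep-size 8` would become `--vllm-ep-size 8` instead of the boolean flag. A blind prefix swap is also not sufficient on its own, since sglang-only parameters are deleted rather than renamed.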

2. GPT-OSS 20B support

Three fixes are required to run GPT-OSS 20B RLHF on the vLLM backend:

hf_weight_iterator_bridge.py: match the Megatron-Bridge 0.5.0 API. maybe_modify_converted_hf_weight gained a fourth parameter, hf_state_dict, but the monkey-patched wrapper accepted only three, causing a TypeError during weight sync. The same fix was submitted upstream as THUDM/slime#2113 (fix(gpt-oss): update _patch_bridge_expert_cache_to_cpu to match Megatron-Bridge API).
--hf-checkpoint fused BF16: vLLM's _load_weights_other expects a fused gate_up_proj of shape [E, hidden, 2×ffn]. The old per-expert split format causes a KeyError on bias loading. tools/convert_gpt_oss_to_fused.py converts an existing checkpoint without re-running the slow MXFP4 dequantization.
--qkv-format bshd: with GPT-OSS's learnable softmax, qkv_format=thd disables all TE attention backends; bshd avoids this. Because --use-dynamic-batch-size is incompatible with bshd, it is replaced with a fixed --seq-length 10240.

3. Restore deleted examples and scripts (from PR #220)

examples/coding_agent_rl/, examples/geo3k_vlm/, examples/multi_agent/, examples/train_infer_mismatch_helper/
scripts/run-glm4.7-30B-A3B.sh, run-glm4.7-355B-A32B.sh, run-minimax-m2.sh, run-qwen3-30B-A3B.sh

4. Precise pkill pattern (all scripts)

Replace pkill -9 vllm with pkill -9 -f '[v]llm serve|VLL[M]::', which targets only vllm serve processes and Ray VLLM:: actors and avoids accidental kills in colocated mode.
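A quick way to see what the pattern selects is to run the same extended regex through `grep -E` against sample command lines. This is a sketch, not a capture from a real run: the command lines below are illustrative, and `would_match` and `check` are hypothetical helpers.

```shell
pattern='[v]llm serve|VLL[M]::'

# Returns 0 when a command line would be selected by the pattern.
would_match() {
  printf '%s\n' "$1" | grep -Eq -- "$pattern"
}

check() {
  if would_match "$1"; then echo "match:    $1"; else echo "no match: $1"; fi
}

check "vllm serve /models/example --port 8000"
check "VLLM::EngineCore"
check "python3 train.py --colocate"
# A wrapper whose own command line carries the pattern text:
check "bash -c pkill -9 -f '[v]llm serve|VLL[M]::'"
```

The first two lines match and the last two do not. The bracket form matters for the last case: the literal text `[v]llm serve` does not contain the string `vllm serve`, so a shell or SSH wrapper that carries the pattern on its command line is not selected by it.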

Test plan

run-gpt-oss-20B.sh: validate rollout starts and weight sync completes (step 1)
Other scripts: bash -n syntax check

🤖 Generated with Claude Code

Restore files that were either deleted by vllm-project#126 ("trim examples to qwen3 only") or never synced from slime:

**Reverted from pre-vllm-project#126 (translated):**
- scripts/low_precision/run-qwen3-4b-fp8.sh
- scripts/low_precision/run-qwen3-30b-a3b-fp8.sh
- scripts/run-glm4-9B.sh
- scripts/run-moonlight-16B-A3B.sh
- scripts/run-qwen3-4B-base-sft.sh
- scripts/run-qwen3-32B.sh
- scripts/run-qwen3.5-35B-A3B-sft.sh

**New from slime@44d29ee (translated):**
- docs/en/get_started/agent.md
- examples/fully_async/run-qwen2.5-0.5B-fully_async.sh

All sglang engine flags translated to vllm equivalents (§2.4).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Standardize all scripts to use the bracket-escaped pkill pattern that avoids matching pkill itself and also catches vLLM's renamed subprocesses (VLLM::EngineCore, VLLM::Worker_TP*). Matches the canonical pattern in command_utils.py. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

gemini-code-assist

Code Review

This pull request adds and updates several training and rollout shell scripts for various models, including Qwen, Kimi-K2, DeepSeek-R1, and GLM, to support low-precision training (INT4 and FP8) and integrate vLLM. The review feedback highlights several critical issues, including a missing trailing backslash in run-kimi-k2-Instruct.sh that breaks the Ray job submission, incorrect relative source paths for model configurations across multiple scripts, leftover paths and package names from the 'slime' repository, a typo in the Python buffering environment variable, and a leading blank line before the shebang in run-mimo-7B-rl-eagle.sh.

gemini-code-assist · 2026-06-16T14:43:17Z

+   --actor-num-nodes 32 \
+   --actor-num-gpus-per-node 8 \
+   --colocate \
+   --update-weight-buffer-size $(( 4 * 512 * 1024 * 1024))


The line is missing a trailing backslash \. This will cause the shell to treat the subsequent lines as a separate command, breaking the ray job submit execution.

Suggested change

-   --update-weight-buffer-size $(( 4 * 512 * 1024 * 1024))
+   --update-weight-buffer-size $(( 4 * 512 * 1024 * 1024)) \
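A minimal sketch of why the continuation matters, using `set --` in place of the real `ray job submit` call so it can run anywhere. `--hypothetical-next-flag` is a stand-in for whatever follows in the actual script, and a shell with 64-bit arithmetic is assumed.

```shell
# With every continuation backslash in place, all flags land in one argument list.
# --hypothetical-next-flag stands in for the flags that follow in the real script.
set -- --colocate \
   --update-weight-buffer-size $(( 4 * 512 * 1024 * 1024 )) \
   --hypothetical-next-flag

arg_count=$#
buffer_bytes=$3
echo "args: $arg_count, buffer size: $buffer_bytes bytes"
# prints: args: 4, buffer size: 2147483648 bytes
```

Dropping the backslash after the arithmetic expansion would end the command there, and the shell would then try to run `--hypothetical-next-flag` as a separate command. The expression itself evaluates to 2 GiB (4 × 512 MiB).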

gemini-code-assist · 2026-06-16T14:43:17Z

+   # --global-batch-size 256
+
+   --over-sampling-batch-size 256
+   --dynamic-sampling-filter-path slime.rollout.filter_hub.dynamic_sampling_filters.check_reward_nonzero_std


The package has been renamed/translated from slime to vime (as seen in the codebase structure, e.g., vime/rollout/vllm_rollout.py). Using slime.rollout... will result in a ModuleNotFoundError. Please update this path to use vime instead of slime.

Suggested change

-   --dynamic-sampling-filter-path slime.rollout.filter_hub.dynamic_sampling_filters.check_reward_nonzero_std
+   --dynamic-sampling-filter-path vime.rollout.filter_hub.dynamic_sampling_filters.check_reward_nonzero_std

gemini-code-assist · 2026-06-16T14:43:17Z

+
+ray job submit --address="http://127.0.0.1:8265" \
+   --runtime-env-json="${RUNTIME_ENV_JSON}" \
+   -- python3 /personal/slime/slime/train.py \


The script is executing /personal/slime/slime/train.py which is a leftover path from the slime repository. It should be updated to train.py to run the vime training script in the current workspace, consistent with the other run scripts.

Suggested change

-   -- python3 /personal/slime/slime/train.py \
+   -- python3 train.py \

gemini-code-assist · 2026-06-16T14:43:17Z

+echo "HAS_NVLINK: $HAS_NVLINK (detected $NVLINK_COUNT NVLink references)"
+
+SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &>/dev/null && pwd)"
+source "${SCRIPT_DIR}/../scripts/models/qwen3-30B-A3B.sh"


The source path ../scripts/models/qwen3-30B-A3B.sh is incorrect. Since this script is located in scripts/low_precision/, .. resolves to scripts/, making the path scripts/scripts/models/... which does not exist. It should be ../models/qwen3-30B-A3B.sh.

Suggested change

-source "${SCRIPT_DIR}/../scripts/models/qwen3-30B-A3B.sh"
+source "${SCRIPT_DIR}/../models/qwen3-30B-A3B.sh"
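The path resolution described in this comment can be reproduced with a throwaway directory tree. This is a sketch under assumptions: the layout below mirrors only the directories named in the comment, and `SCRIPT_DIR` is set by hand instead of being derived from `BASH_SOURCE`.

```shell
tmp="$(mktemp -d)"
mkdir -p "$tmp/scripts/low_precision" "$tmp/scripts/models"
: > "$tmp/scripts/models/qwen3-30B-A3B.sh"

# What SCRIPT_DIR would hold for a script living in scripts/low_precision/.
SCRIPT_DIR="$tmp/scripts/low_precision"

old_path="${SCRIPT_DIR}/../scripts/models/qwen3-30B-A3B.sh"
new_path="${SCRIPT_DIR}/../models/qwen3-30B-A3B.sh"

if [ -f "$old_path" ]; then old_found=yes; else old_found=no; fi
if [ -f "$new_path" ]; then new_found=yes; else new_found=no; fi
echo "old path found: $old_found, new path found: $new_found"
# prints: old path found: no, new path found: yes

rm -rf "$tmp"
```

The same reasoning applies to the other `source` lines flagged in this review: what matters is the directory the script itself lives in, not the repository root.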

gemini-code-assist · 2026-06-16T14:43:18Z

+echo "HAS_NVLINK: $HAS_NVLINK (detected $NVLINK_COUNT NVLink references)"
+
+SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &>/dev/null && pwd)"
+source "${SCRIPT_DIR}/../scripts/models/qwen3-4B.sh"


The source path ../scripts/models/qwen3-4B.sh is incorrect. Since this script is located in scripts/low_precision/, .. resolves to scripts/, making the path scripts/scripts/models/... which does not exist. It should be ../models/qwen3-4B.sh.

Suggested change

-source "${SCRIPT_DIR}/../scripts/models/qwen3-4B.sh"
+source "${SCRIPT_DIR}/../models/qwen3-4B.sh"

gemini-code-assist · 2026-06-16T14:43:18Z


 SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &>/dev/null && pwd)"
-source "${SCRIPT_DIR}/models/qwen2.5-0.5B.sh"
+source "${SCRIPT_DIR}/scripts/models/qwen2.5-0.5B.sh"


The source path was incorrectly changed to ${SCRIPT_DIR}/scripts/models/.... Since this script is located in scripts/, ${SCRIPT_DIR} is already scripts/, making the path scripts/scripts/models/... which does not exist. It should be reverted to ${SCRIPT_DIR}/models/qwen2.5-0.5B.sh.

Suggested change

-source "${SCRIPT_DIR}/scripts/models/qwen2.5-0.5B.sh"
+source "${SCRIPT_DIR}/models/qwen2.5-0.5B.sh"

gemini-code-assist · 2026-06-16T14:43:18Z

 set -ex

-export PYTHONUNBUFFERED=1
+export PYTHONBUFFERED=16


The environment variable PYTHONBUFFERED=16 is a typo. The standard Python environment variable to control buffering is PYTHONUNBUFFERED (typically set to 1 to disable buffering). Python does not recognize PYTHONBUFFERED.

Suggested change

-export PYTHONBUFFERED=16
+export PYTHONUNBUFFERED=1

gemini-code-assist · 2026-06-16T14:43:18Z

+
+#!/bin/bash


There is a leading empty line before the shebang #!/bin/bash. For a shebang to be recognized by the OS kernel when executing the script directly, it must be the absolute first line of the file.

Suggested change

-
-#!/bin/bash
+#!/bin/bash
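A simple guard for this class of mistake is to check that a script's first two bytes are `#!`. The sketch below builds two throwaway files to show the check. `starts_with_shebang` is a hypothetical helper, and `head -c` is assumed to be available (it is in GNU and BSD userlands).

```shell
tmp="$(mktemp -d)"
printf '\n#!/bin/bash\necho hi\n' > "$tmp/with_blank_line.sh"
printf '#!/bin/bash\necho hi\n' > "$tmp/fixed.sh"

# A shebang only counts when "#!" are the first two bytes of the file.
starts_with_shebang() {
  [ "$(head -c 2 "$1")" = '#!' ]
}

for f in "$tmp/with_blank_line.sh" "$tmp/fixed.sh"; do
  if starts_with_shebang "$f"; then
    echo "$(basename "$f"): shebang on first line"
  else
    echo "$(basename "$f"): no shebang on first line"
  fi
done

rm -rf "$tmp"
```

Scripts invoked as `bash script.sh` are unaffected because the interpreter is named explicitly. The problem only appears when the file is executed directly.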

gemini-code-assist · 2026-06-16T14:43:18Z

 # 229B MoE, 256 experts -> requires many GPUs
 # Typical config: TP=2, PP=2, EP=4, training side 16 GPUs (2 nodes x 8 GPUs)
-# Inference side: vLLM on separate GPUs, EP=16+
+# Inference side: SGLang on separate GPUs, EP=16+


The comment was updated to refer to SGLang instead of vLLM. Since this PR is migrating the codebase from SGLang to vLLM, this comment is backwards and misleading. It should refer to vLLM.

Suggested change

-# Inference side: SGLang on separate GPUs, EP=16+
+# Inference side: vLLM on separate GPUs, EP=16+

Translate all slime scripts to vime following SGLANG_TO_VLLM_TRANSLATION.md:
- sglang→vllm prefix swap for CLI flags and variables
- _slime→_vime for checkpoint paths
- EP: --sglang-ep-size N → --vllm-enable-expert-parallel (boolean)
- Speculative: multi-param → --vllm-speculative-config JSON (§5.2)
- Delete genuinely sglang-coupled params (DP-attention, DeepEP, NSA, etc.)
- flashinfer → FLASHINFER case fix (§2.4)

23 new scripts + 6 existing updated to match slime@cutoff.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…cripts The FP8 scripts used `${SCRIPT_DIR}/../scripts/models/` which resolves to `scripts/scripts/models/` (non-existent). Changed to `../models/` to match the INT4 scripts. Same fix as slime PR #2094. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Three fixes needed to run GPT-OSS 20B RLHF on vLLM backend:

1. hf_weight_iterator_bridge: match Megatron-Bridge 0.5.0 API
   _patch_bridge_expert_cache_to_cpu monkey-patches GPTOSSBridge. maybe_modify_converted_hf_weight gained a 4th `hf_state_dict` parameter; the patched wrapper only accepted 3, causing TypeError during weight sync.

2. run-gpt-oss-20B: point --hf-checkpoint at fused BF16 format
   vLLM's _load_weights_other expects gate_up_proj [E, hidden, 2*ffn] (fused). The old per-expert split format (experts.{e}.gate_proj.weight) causes KeyError on bias loading. Use tools/convert_gpt_oss_to_fused.py to convert an existing per-expert checkpoint, or re-run preprocess_gpt_oss.py to produce fused format directly.

3. run-gpt-oss-20B: add --qkv-format bshd + fix seq-length
   GPT-OSS uses learnable softmax (sink attention). TransformerEngine disables all attention backends when softmax_type=learnable and qkv_format=thd (packed sequences). --qkv-format bshd avoids this. --use-dynamic-batch-size is incompatible with bshd; replaced with fixed --seq-length 10240 (covers 8192 max response + prompt headroom).

tools/convert_gpt_oss_to_fused.py: new tool to convert per-expert BF16 checkpoint (output of old preprocess_gpt_oss.py) to the fused HF format expected by vLLM without re-running the slow MXFP4 dequantization.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

pkill -9 vllm matches any process named "vllm" and can inadvertently kill unrelated vllm processes (e.g. background services). Use the same pattern as PR vllm-project#220 which targets only vllm serve and Ray VLL[M]:: actors:

pkill -9 -f '[v]llm serve|VLL[M]::'

Also updates the inline form used in multi-node SSH worker restart commands (run-qwen3-235B-A22B*.sh, run-qwen3.5-27B.sh, etc.).

Skipped: scripts/run-gpt-oss-20B.sh (uses pkill -9 -f "vllm serve" already), scripts/run-minimax-m2.sh and run-glm4.7-*.sh (already used -f "vllm serve").

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

read-the-docs-community · 2026-06-23T04:14:33Z

Documentation build overview

📚 vime | 🛠️ Build #33263922 | 📁 Comparing 7429dea against latest (2864b34)

🔍 Preview build

26 files changed · ± 26 modified

aoshen02 and others added 2 commits June 9, 2026 15:38

gemini-code-assist Bot reviewed Jun 16, 2026

aoshen02 force-pushed the scripts/gb300-complete-port branch 2 times, most recently from d5c572e to e5d6f3a Compare June 16, 2026 14:50

aoshen02 closed this Jun 16, 2026

aoshen02 force-pushed the scripts/gb300-complete-port branch from e5d6f3a to 2864b34 Compare June 16, 2026 14:53

aoshen02 reopened this Jun 16, 2026

aoshen02 mentioned this pull request Jun 21, 2026

[RFC] VIME Roadmap #11

Open

15 tasks

aoshen02 and others added 3 commits June 21, 2026 14:11

Merge remote-tracking branch 'origin/restore-examples' into scripts/g…

486ee58

…b300-complete-port # Conflicts: # scripts/run-glm4-9B.sh # scripts/run-moonlight-16B-A3B.sh # scripts/run-qwen3-32B.sh # scripts/run-qwen3-4B-base-sft.sh # scripts/run-qwen3.5-35B-A3B-sft.sh

aoshen02 changed the title ~~scripts: complete slime-exact translation of all 29 run scripts~~ scripts: complete slime-exact port of all scripts + gpt-oss 20B support Jun 21, 2026

chore(scripts): remove run-qwen3-4B-amd.sh from this PR

7429dea

AMD-specific script is out of scope for the gb300-complete-port PR. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

aoshen02 wants to merge 8 commits into vllm-project:main from aoshen02:scripts/gb300-complete-port
